Complexity Analysis Of Next-Generation VVC Encoding and Decoding
While the next-generation video compression standard, Versatile Video Coding
(VVC), provides superior compression efficiency, its computational complexity
increases dramatically. This paper thoroughly analyzes this complexity for both
encoder and decoder of VVC Test Model 6, by quantifying the complexity
break-down for each coding tool and measuring the complexity and memory
requirements for VVC encoding/decoding. These extensive analyses are performed
for six video sequences of 720p, 1080p, and 2160p, under Low-Delay (LD),
Random-Access (RA), and All-Intra (AI) conditions (a total of 320
encodings/decodings). Results indicate that the VVC encoder and decoder are 5x
and 1.5x more complex than HEVC in LD, and 31x and 1.8x in AI,
respectively. Detailed analysis of coding tools reveals that, in LD on average,
motion estimation dominates the encoding complexity with 53%, followed by
transformation and quantization with 22% and entropy coding with 7%. In
decoding, loop filters with 30%, motion compensation with 20%, and entropy
decoding with 16% are the most complex modules.
Moreover, the memory
bandwidths required for VVC encoding and decoding are measured through memory
profiling and found to be 30x and 3x those of HEVC, respectively. The reported
results and insights serve as a guide for future research on, and
implementations of, energy-efficient VVC encoders/decoders. Comment: IEEE ICIP 202
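The per-tool shares reported in the abstract can be tabulated as a quick sanity check. This is a minimal sketch using only the numbers stated above (LD configuration, averages); the dictionary names and the lumping of all unlisted tools into an "other" remainder are illustrative assumptions, not part of the paper's methodology.

```python
# Per-tool complexity shares as reported in the abstract (LD, on average).
ENCODER_SHARES = {
    "motion estimation": 0.53,
    "transform & quantization": 0.22,
    "entropy coding": 0.07,
}
DECODER_SHARES = {
    "loop filters": 0.30,
    "motion compensation": 0.20,
    "entropy decoding": 0.16,
}

def with_remainder(shares):
    """Return a copy of the breakdown with all unlisted tools as 'other'."""
    out = dict(shares)
    out["other"] = round(1.0 - sum(shares.values()), 2)
    return out

enc = with_remainder(ENCODER_SHARES)  # 'other' tools: 0.18
dec = with_remainder(DECODER_SHARES)  # 'other' tools: 0.34
```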
BLINC: Lightweight Bimodal Learning for Low-Complexity VVC Intra Coding
The latest video coding standard, Versatile Video Coding (VVC), achieves
almost twice the coding efficiency of its predecessor, High Efficiency
Video Coding (HEVC). However, achieving this efficiency (for intra coding)
requires 31x the computational complexity of HEVC, making it challenging
for low-power and real-time applications. This paper proposes a novel machine
learning approach that jointly and separately employs two modalities of
features to simplify the intra coding decision. First, a set of features is
extracted using the existing DCT core of VVC to assess texture
characteristics, forming the first modality of data. This produces
high-quality features with almost no overhead. The distribution of intra modes at
the neighboring blocks is also used to form the second modality of data, which
provides statistical information about the frame. Second, a two-step feature
reduction method is designed that reduces the size of the feature set, so that a
lightweight model with a limited number of parameters can be used to learn the
intra mode decision task. Third, three separate training strategies are
proposed: (1) an offline training strategy using the first (single) modality of
data, (2) an online training strategy that uses the second (single) modality,
and (3) a mixed online-offline strategy that uses bimodal learning. Finally, a
low-complexity encoding algorithm is proposed based on these learning
strategies. Extensive experimental results show that the proposed methods can
reduce encoding time by up to 24%, with a negligible loss of coding efficiency.
Moreover, it is demonstrated how a bimodal learning strategy can boost the
performance of learning. Lastly, the proposed method has a very low
computational overhead (0.2%) and uses existing components of a VVC encoder,
which makes it much more practical than competing solutions.
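The two feature modalities described in this abstract can be sketched roughly as follows. This is an illustrative approximation only: the function names, block size, frequency splits, and the naive 2-D DCT are assumptions, whereas the actual method reuses VVC's own DCT core and its own (unspecified here) feature definitions.

```python
import numpy as np

def dct_2d(block: np.ndarray) -> np.ndarray:
    """Naive orthonormal 2-D DCT-II of a square block.

    Stand-in for VVC's internal DCT core, which the paper reuses instead
    of computing a separate transform like this one.
    """
    n = block.shape[0]
    k = np.arange(n)
    # DCT-II basis matrix: C[u, x] = cos(pi * (2x + 1) * u / (2n)), scaled
    # so that C is orthonormal.
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0, :] *= 1 / np.sqrt(2)
    C *= np.sqrt(2 / n)
    return C @ block @ C.T

def texture_features(block: np.ndarray) -> np.ndarray:
    """Modality 1 (illustrative): texture descriptors from DCT energies."""
    coeffs = dct_2d(block.astype(np.float64))
    energy = coeffs ** 2
    total = energy.sum() + 1e-12
    low = energy[:2, :2].sum() / total        # low-frequency (smooth) share
    horiz = energy[0, 1:].sum() / total       # horizontal detail
    vert = energy[1:, 0].sum() / total        # vertical detail
    return np.array([low, horiz, vert])

def mode_histogram(neighbor_modes, num_modes=67):
    """Modality 2 (illustrative): intra-mode distribution of neighbors.

    VVC defines 67 intra prediction modes; the histogram summarizes which
    modes were chosen for the already-coded neighboring blocks.
    """
    hist = np.zeros(num_modes)
    for m in neighbor_modes:
        hist[m] += 1
    return hist / max(len(neighbor_modes), 1)
```

For example, a perfectly flat 8x8 block concentrates all DCT energy in the DC coefficient, so its low-frequency share is 1, while a block with strong vertical edges shifts energy into the horizontal-frequency terms; a lightweight classifier can act on such descriptors without running the full mode search.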